A new pitch synchronous time domain phoneme recognizer using component analysis and pitch clustering

Authors

  • Ramon Prieto
  • Jing Jiang
  • Chi-Ho Choi
Abstract

A new framework for time-domain voiced phoneme recognition is presented. Each speech frame used for training and recognition is bounded by consecutive glottal closures. A preprocessing stage is designed and implemented to model pitch-synchronous frames with Gaussian mixture models. Component analysis carried out on the data shows optimal performance with a very small number of components, requiring low computational power. We designed a new clustering technique that, using the pitch period, gives better results than other well-known clustering algorithms such as k-means.
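As a rough illustration of what "frames bounded by consecutive glottal closures" can look like in practice, the following Python sketch cuts a signal at given closure indices and resamples each frame to a fixed length. It is not the authors' implementation: the function names `pitch_synchronous_frames` and `resample_linear`, the synthetic test signal, and the assumption that glottal closure instants are already known as sample indices are all hypothetical choices made for this example.

```python
import math


def pitch_synchronous_frames(signal, closures):
    """Split `signal` into frames bounded by consecutive closure indices.

    `closures` is assumed to be a sorted list of sample indices at which
    glottal closures occur (how they are detected is outside this sketch).
    """
    frames = []
    for start, end in zip(closures, closures[1:]):
        # Skip malformed or out-of-range boundaries.
        if 0 <= start < end <= len(signal):
            frames.append(signal[start:end])
    return frames


def resample_linear(frame, length):
    """Linearly interpolate `frame` to exactly `length` samples."""
    if length < 2:
        raise ValueError("length must be at least 2")
    if len(frame) == 1:
        return [frame[0]] * length
    step = (len(frame) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(frame) - 1)
        frac = pos - lo
        out.append(frame[lo] * (1 - frac) + frame[hi] * frac)
    return out


# Synthetic "voiced" signal with a constant period of 80 samples.
signal = [math.sin(2 * math.pi * n / 80) for n in range(400)]
closures = [0, 80, 160, 240, 320]  # assumed closure instants

frames = pitch_synchronous_frames(signal, closures)
periods = [len(f) for f in frames]          # pitch period per frame, in samples
fixed = [resample_linear(f, 32) for f in frames]
```

Each frame's length in samples is its pitch period, which is the quantity the abstract says the clustering step uses. Resampling to a fixed length is one possible way to make frames of different periods comparable before any modeling; the abstract does not state whether the paper does this.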

Similar resources

Text-to-Speech Synthesis using Phoneme Concatenation

We propose a text-to-speech (TTS) synthesis system based on phonetic concatenation for unrestricted input text. The input text is first converted into a phonetic transcription using letter-to-sound rules. To synthesize new speech, the TTS system selects recorded phoneme units (PUs) from a database and modifies their duration according to the rule based on spelling using Time Domain Pitch Synchron...

An Intonational Phrase Boundary and Pitch Accent Dependent Speech Recognizer

Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. We describe the idea of prosody dependent speech recognition by building a prosody dependent speech recognizer that conditions word and phoneme models on two important prosodic variables: intonational phrase bou...

Classification of Iranian Traditional Music Dastgahs Using Features Based on Pitch Frequency

Iranian traditional music is composed of seven major Dastgahs: Chahargah, Homayoun, Mahour, Segah, Shour, Nava, and Rast-Panjgah. In this paper, a new algorithm for the classification of the Iranian traditional music Dastgahs based on pitch frequency is proposed. In this algorithm, the features of Lagrange coefficients of pitch logarithm (LCPL), Fuzzy similarity sets type 2 (FSST2), and th...

Robust Controller Design for IG Driven by Variable-Speed in WECS Using μ-Synthesis

This paper presents a robust controller design for a wind-driven induction generator system using the structured singular value (μ-synthesis) method. The controller was designed for a static synchronous compensator (STATCOM) and a variable blade pitch angle in a wind energy conversion system (WECS) in order to achieve the required voltage and mechanical power control. The results indicated that this ...

Experiments on Chinese Speech Recog Pitch Estimation Using the M

Automatic speech recognition of a tonal and syllabic language such as Chinese Mandarin poses new challenges but also offers new opportunities. We present approaches and experimental results concerning the choice of base units for acoustic modeling, pitch estimation and how to integrate pitch estimates into the modeling framework. The experimental evaluations are carried out both on rather clean...

Publication date: 2003